| Full Name | Matriculation No. |
|---|---|
| Chen Anqi | A0188533W |
| Heng Tze Ying Faith | A0173090L |
| Phua Jun Yuan Ryan | A0182724A |
| Soon Yoke Sze Cheryl | A0204097H |
Despite the Singapore government’s policies to reduce the number of vehicles, there were nearly one million vehicles on Singapore’s roads as of 2019. Roads are frequently congested, especially during peak periods, and carparks inevitably become crowded as well.
Drivers commute to and from their workplaces, shop at supermarkets, exercise at sports centers, or travel to other destinations on a daily basis. Some drivers plan their trips before leaving the house. This planning stage involves searching online for carparks near their destination. They may also compare the parking rates of those carparks to save cost. Drivers then head out to their chosen carpark.
However, even with such planning, drivers often reach their destination only to find themselves stuck in a long queue to enter the nearest carpark, or unable to find a lot inside a popular carpark. This happens because they do not know the number of available parking lots before they arrive. Without a place to park, drivers have to loiter and wait until someone leaves before a lot becomes available. The time wasted builds up considerable frustration, especially when the wait is long.
Often, this problem arises not because Singapore lacks parking capacity, but because drivers are unaware of nearby alternative carparks with many more available lots than the popular ones.
Hence, our team has developed an app in which users can input their destination and view nearby carparks together with their availability data. Our app serves as a one-stop source of information on the locations, parking rates and real-time lot availability of carparks near the user’s chosen destination. Since people often want to explore the area around their destination, our app also provides information on nearby facilities such as supermarkets, shopping malls and hawker centers. Moreover, with our app’s “carpark availability prediction” functionality, drivers can plan their trip beforehand to minimise waiting time, instead of driving around to scout for carparks with available lots when they reach their destination.
The LTA recently announced a new ruling whereby visitor parking lots would be phased out in all new condominiums in Singapore (Ng, 2020). While members of the public will not be allowed to park at private carparks (condominiums/landed properties), they are still allowed to park in public carparks such as HDB and shopping mall carparks. Hence, we focus on the real-time availability of HDB carparks as well as the carparks in Singapore’s main Heartland malls.
Our primary target audience is drivers, specifically car drivers.
One potential stream of revenue could come from advertisements within the app.
Given that many shopping malls hold periodic parking promotions, we could partner with these malls to advertise the promotions. This would increase the footfall and encourage spending in those malls, thereby generating both revenue and awareness for the mall.
Furthermore, we can partner with businesses that would like to reach car owners, such as car servicing or car accessory companies. We can embed their links in the app so that users are directed to their websites when they click on an advertisement. Since our platform provides a very targeted means for our partners to reach car owners, it would be beneficial for them to advertise with us. Through these partnerships, we can bill our partners based on the engagement rate as well as the number of clicks on each advertisement.
We first obtain data such as carpark locations and rates, public facilities, etc. from various online data sources and clean the data as needed before loading it into our main server. Depending on where the user is heading, the relevant data is selected and displayed in the app, where the user can choose a carpark near the destination. The application works in real time, so users get the most up-to-date information on the availability of each carpark and can plan their route accordingly.
Real-time lot availability at HDB carparks was retrieved from Data.gov.sg. The dataset is refreshed every minute, and the date_time parameter was used to retrieve the latest carpark availability at that moment in time.
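As a rough illustration of how the date_time parameter might be built, the sketch below formats a timestamp and appends it to a query URL. The endpoint path and the helper name build_availability_url() are assumptions made for this example, and the timestamp is assumed to be expected in Singapore time.

```r
# Hypothetical helper: build a carpark availability query URL for a given moment.
# The endpoint path below is an assumption and should be checked against Data.gov.sg.
build_availability_url <- function(when = Sys.time()) {
  # Format the time as YYYY-MM-DDTHH:MM:SS in Singapore time
  stamp <- format(when, "%Y-%m-%dT%H:%M:%S", tz = "Asia/Singapore")
  paste0("https://api.data.gov.sg/v1/transport/carpark-availability?date_time=", stamp)
}

example_time <- as.POSIXct("2020-11-01 09:30:00", tz = "Asia/Singapore")
url <- build_availability_url(example_time)
print(url)
# The response could then be fetched and parsed with jsonlite::fromJSON(url)
# (a network call, so it is not run in this sketch).
```

Calling the helper with no argument uses the current time, which corresponds to asking for the latest reading.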
Real-time lot availability for hourly parking at Capitaland Malls was retrieved from the Capitaland website.
Furthermore, Carpark Rates (Major Shopping Malls, Attractions and Hotels) information was retrieved from Data.gov.sg in CSV file format: https://data.gov.sg/dataset/carpark-rates. The carparks were then geocoded to find their respective coordinates, and the result was written to carpark_rates.csv.
Real-time 2-hour weather forecast among the town areas in Singapore was retrieved from Data.gov.sg. The dataset is refreshed every 30 minutes, and the date_time parameter was used to retrieve the latest weather condition at that moment in time.
2-Hour Weather Forecast: https://data.gov.sg/dataset/weather-forecast
To divide Singapore into its town areas and plot their geographical boundaries on the map, we used the file “singapore-towns.zip” from the Lecture 8 (Advanced Data Visualisation) course material and the relevant files inside it.
Datasets used to fill our destination selector checkboxes were retrieved from multiple sources, then written to separate CSV files. Finally, the CSV files were combined into one locations.csv file containing all the locations in Singapore by their category.
The app serves as a platform where users can input their destination and view nearby carparks, their parking rates and their available lots.
Besides this, users can find the facility of a chosen type (e.g. supermarkets, medical clinics, hawker centres, hotels) nearest to their destination. The app displays the distance (as well as the time) needed to reach the nearest facility. Additionally, users can check the weather forecast for the next 2 hours in the area they are heading to, so that they can plan their route.
One powerful functionality our app provides for pre-departure route planning is predicting the lot availability at a carpark, extrapolated from historical data. This feature takes into consideration the forecast weather condition at the destination and the day of the week the driver plans to travel on, and draws its conclusion from an analysis of historical weather and carpark availability data.
The app UI comprises 4 main tabs: “Reaching Destination”, “Planning Before Departure”, “2-Hour Weather Forecast” and “About This App”.
The shinydashboard package was used to create a dashboard-like interface for the app, with a sidebar menu of the 4 tabs. Drivers can choose either the “Reaching Destination” tab or the “Planning Before Departure” tab and use our app with a few simple clicks. Additionally, drivers can find out the weather forecast for the next 2 hours in the town area they are heading to by simply selecting the area in the “2-Hour Weather Forecast” tab. Separating the tabs by functionality and selection inputs helps to provide a clean and friendly user interface for our users to get information pertaining to their journey.
UI of First Tab: Reaching Destination
The first “Reaching Destination” tab is for drivers who are on the road and are reaching their destination soon. Drivers can input their destination and use the app to search for nearby carparks. Depending on their preference, they can select the carpark types to include in the search. The app returns the nearest few carparks and their distance from the destination. Additionally, drivers may explore locations of interest near their destination. Information such as the carpark rates and real-time carpark lot availability is also shown.
This tab was developed by first sourcing the location datasets online, and geocoding them to input their location parameters into the map.
We created the functions getRealTimeCarparkAvailability() and getCapitalandRealTimeLots() to get the real-time lot availability of Capitaland shopping malls and HDB carparks, using API data from the relevant websites. We also created the functions findNearest() and displayNearest() to find and display the carparks/facilities near the user’s destination input, together with their distance from it, using the distm() function together with the which.min() function.
We developed this tab because we felt that with this information, our users would be able to plan where to park more efficiently. They would also be able to decide on their next location easily once they are done at their first destination, as our app provides a comprehensive list of nearby location categories (shopping malls, hawker centres, sport facilities, etc.) that users can choose from if they would like to explore the area or go somewhere else before heading home.
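The nearest-carpark lookup can be sketched as follows. This is a minimal illustration and not the team's findNearest() itself: it swaps in a base R haversine formula for geosphere's distm() so that it runs without extra packages, and the helper names and toy coordinates are made up for the example.

```r
# Great-circle distance in metres from one point to many points (haversine).
# This stands in for geosphere::distm() so the sketch has no dependencies.
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 + cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: return the k places closest to a destination.
# With k = 1 this is equivalent to picking the row at which.min(distance).
find_nearest_sketch <- function(dest_lon, dest_lat, places, k = 1) {
  places$distance_m <- haversine_m(dest_lon, dest_lat, places$lon, places$lat)
  head(places[order(places$distance_m), ], k)
}

# Toy data with made-up coordinates
carparks <- data.frame(
  name = c("Carpark A", "Carpark B", "Carpark C"),
  lon  = c(103.8500, 103.8520, 103.9000),
  lat  = c(1.3000, 1.3010, 1.3500),
  stringsAsFactors = FALSE
)

nearest <- find_nearest_sketch(103.8519, 1.3008, carparks, k = 2)
print(nearest$name)  # "Carpark B" "Carpark A"
```

Because the distance is a straight line between coordinates, the ranking is only an approximation of the actual walking or driving distance.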
UI of Second Tab: Planning Before Departure
The second “Planning Before Departure” tab looks similar to the first tab, but with the added input of a starting point, so drivers can plan their trip ahead of departure. The driving time from their starting point to their destination is displayed, along with the distance between the 2 locations. This was done by creating the function findDistance(), which primarily uses the function gmapsdistance() to compute the distance and driving time between 2 points through the Google Maps Distance Matrix API.
In addition, we display the real-time carpark availability of Capitaland malls and the projected HDB carpark availability, assuming the user leaves the starting point at the time of the search and factoring in the travelling time. We created the functions getHistoricalWeather(), categorizeWeather(), categorizeRainfall() and getPredictedCarparkAvailability(), which use the historical data of the past 4 weeks (for demo purposes, but this could be extended) for the same day, time and weather conditions as the user’s input, and take the average of the available carpark lots to get the predicted number of lots in that specific carpark. When analysing the weather conditions, we looked at rainfall data in particular, because we felt that in Singapore, rain is the most important factor affecting people’s travel options and, in turn, carpark availability.
We developed this tab because we felt that users would be interested to know the driving time from their current location to their destination. Moreover, we feel that matching historical data to the conditions of the user’s input and deriving a predicted lot number gives a good estimate of which carparks are likely to have an available lot by the time they arrive, before they leave their current location.
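Once a distance and a driving time are available, turning them into display text and an estimated arrival time could look like the sketch below. It assumes the distance is given in metres and the time in seconds, which is how we understand gmapsdistance() to report them. The helper summarise_trip() and the sample figures are hypothetical.

```r
# Hypothetical helper: format a trip for display and estimate the arrival time.
# Assumes distance in metres and duration in seconds.
summarise_trip <- function(distance_m, duration_s, depart = Sys.time()) {
  list(
    distance_text = sprintf("%.1f km", distance_m / 1000),
    duration_text = sprintf("%d min", as.integer(ceiling(duration_s / 60))),
    arrival       = depart + duration_s  # POSIXct arithmetic works in seconds
  )
}

# Made-up trip: 12.4 km taking 25 minutes, leaving at 8am
depart <- as.POSIXct("2020-11-02 08:00:00", tz = "Asia/Singapore")
trip <- summarise_trip(distance_m = 12400, duration_s = 1500, depart = depart)

print(trip$distance_text)                                     # "12.4 km"
print(trip$duration_text)                                     # "25 min"
print(format(trip$arrival, "%H:%M", tz = "Asia/Singapore"))   # "08:25"
```

The estimated arrival time is what a projected availability lookup would be keyed on, rather than the time of the search.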
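The averaging idea can be illustrated with a small base R sketch. The function names, the rainfall thresholds and the toy history below are all assumptions for illustration and are not the values used in the app.

```r
# Hypothetical rainfall bucketing; the thresholds are illustrative only.
categorise_rainfall_sketch <- function(mm) {
  as.character(cut(mm, breaks = c(-Inf, 0, 2, Inf),
                   labels = c("none", "light", "heavy")))
}

# Average the lots seen on matching past occasions
# (same weekday, same hour, same rainfall category).
predict_lots_sketch <- function(history, weekday, hour, rain) {
  is_match <- history$weekday == weekday &
    history$hour == hour &
    history$rain == rain
  if (!any(is_match)) return(NA_real_)  # no comparable history
  mean(history$lots_available[is_match])
}

# Toy history: four past Mondays at 9am plus one unrelated record (made-up numbers)
history <- data.frame(
  weekday        = c("Mon", "Mon", "Mon", "Mon", "Tue"),
  hour           = c(9, 9, 9, 9, 9),
  rainfall_mm    = c(0, 0, 5.2, 0, 0),
  lots_available = c(40, 50, 12, 45, 80),
  stringsAsFactors = FALSE
)
history$rain <- categorise_rainfall_sketch(history$rainfall_mm)

dry_monday <- predict_lots_sketch(history, "Mon", 9, "none")
wet_monday <- predict_lots_sketch(history, "Mon", 9, "heavy")
print(dry_monday)  # 45, the mean of 40, 50 and 45
print(wet_monday)  # 12, the single rainy Monday
```

Returning NA when no past record matches makes the absence of comparable history explicit instead of silently reporting a misleading number.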
UI of Third Tab: 2-Hour Weather Forecast
The third tab “2-Hour Weather Forecast” displays the 2-hour weather forecast for each town area in Singapore. This tab was developed by first retrieving the 2-hour weather forecast API data, which is updated every half hour, from the relevant website. After manipulating the API dataset, we created a function, getPredictedWeather(), which returns the names of all of Singapore’s town areas, their geographical locations within Singapore, and their 2-hour weather forecasts.
We plotted the town areas on the map but realised that each town was shown only as a single point, which is inaccurate as a town should cover a geographical area. Hence, to plot the borders of the town areas so that users would know which area on the map corresponds to which town, we used the function readOGR() to read the geospatial vector data provided in the “singapore-town.zip” file from Lecture 8’s course material. We then used the function spTransform() with the WGS84 Coordinate Reference System as an input parameter to obtain the geographical borders of each town area in latitude and longitude. Finally, we added the polygons and used the colorFactor() function to colour each town area polygon differently on the map.
From there, we added weather icons and popups to the map to display the forecast weather and the town names, providing a clear and aesthetically pleasing overview of the weather forecast throughout Singapore. The map zooms into a specific town area when users select an area from the dropdown, and a text message shows the forecast weather for that area.
We developed this tab because we felt that providing information on the forecast weather would help our users plan their journey even better. For example, they can expect that their journey may take longer if it is going to rain. They can also better prepare themselves by ensuring that they bring an umbrella out. They may also choose to park at shopping malls instead of HDB carparks, as HDB parking lots might be unsheltered and there might not be sheltered walkways from the carpark to their destination.
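One way to combine the town names, their coordinates and their forecasts is to join the two parts of the API response on the town name, as sketched below. The two data frames are toy stand-ins with made-up values, their column names are assumptions about the response layout, and combine_forecast_sketch() is a hypothetical helper rather than getPredictedWeather() itself.

```r
# Toy stand-ins for the two parts of the forecast response (made-up values).
area_metadata <- data.frame(
  name      = c("Ang Mo Kio", "Bedok", "Clementi"),
  latitude  = c(1.375, 1.321, 1.315),
  longitude = c(103.839, 103.924, 103.760),
  stringsAsFactors = FALSE
)
forecasts <- data.frame(
  area     = c("Bedok", "Ang Mo Kio", "Clementi"),
  forecast = c("Light Rain", "Cloudy", "Fair (Day)"),
  stringsAsFactors = FALSE
)

# Join on the town name so each row carries name, coordinates and forecast.
# Joining by name avoids relying on both tables being in the same row order.
combine_forecast_sketch <- function(area_metadata, forecasts) {
  merged <- merge(area_metadata, forecasts, by.x = "name", by.y = "area")
  merged[, c("name", "longitude", "latitude", "forecast")]
}

town_forecast <- combine_forecast_sketch(area_metadata, forecasts)
print(town_forecast$forecast[town_forecast$name == "Bedok"])  # "Light Rain"
```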
The fourth “About This App” tab provides basic information about the app, as well as clear guidelines on how to use it.
We have managed to find two major availability datasets - one on HDB carparks and one on Capitaland mall carparks. However, for other carparks such as the carparks managed by Urban Redevelopment Authority (URA), real time availability information is currently not readily available online.
The app calculates the straight-line distance from the destination to nearby carparks based on coordinates, and this may not be the most accurate, given that there might be obstructions along the path such as buildings or construction sites that make the path inaccessible. Furthermore, should the nearest supermarket (or any other facility) be located inside the shopping mall itself, for example, its coordinates may differ slightly from the mall’s, and hence the app may display a small distance such as 30m. It does not show that the facility is inside the shopping mall, which is something we could work on in future. However, we believe that the distance reported in the application would still be useful in providing a good estimate to users.
Though we used reactive functions to optimize the server, our app can take a few seconds to plot data when too many nearby location checkboxes are selected simultaneously. This is because a large amount of computation occurs behind the scenes to calculate and compare the distances, query real-time data APIs, and analyse historical data. In the future, we could explore parallelizing the process or reducing the algorithmic complexity of the code so that it responds faster. At the current stage, we have focused on the accuracy of the output.
During the development of our project, we occasionally encountered issues running the application late at night, even though it functioned perfectly well during the day with the same code. Although rare, this issue occurred mostly in the wee hours, typically past 12am, and could be due to a fault in the API data servers from which we obtained real-time weather and availability data. Hence, the functionality of our app may be limited past midnight. However, we reckon that this issue is not severe, considering that our users generally travel during the daytime.
A possible improvement to the app is to include petrol price information. There are many petrol stations in Singapore, and each petrol brand offers different prices. Hence, including real-time petrol price information could be very helpful to drivers, who could then choose the petrol station with the cheapest price. As petrol is priced according to a dynamic pricing system in which prices fluctuate every day, we would have to reach out to the different petrol brands in Singapore to obtain real-time prices. Furthermore, we could also include the locations of ERP gantries so that drivers can plan their routes to avoid them or travel at times when they are not in operation.
Through our app, we hope to be a one-stop-shop platform where drivers will be able to access carpark locations, real-time carpark availability, parking rates, and other relevant details (such as nearby facilities and weather forecast) with ease, solving the common problems and frustrations they face. Although there are a few limitations to our app, we feel that it has the necessary features that drivers require, as we designed the app considering what we would want as drivers ourselves.
Ultimately, we hope that our app will be a consolidated app that supports all carpark-related data for drivers in Singapore and helps them make more informed choices among the various parking alternatives.
Ng, K. C. (2020, June 1). Oh No, No More Visitor Parking In All New Condominiums in Singapore! Retrieved from https://www.homequarters.com.sg/2020/06/01/oh-no-no-more-visitor-parking-in-all-new-condominiums-in-singapore/
The code used by the team for data scraping, data processing, data cleaning and geocoding locations can be found below.
Libraries required:
library(rvest)
library(dplyr)
library(ggmap)
library(xml2)
library(sf)
library(tidyverse)
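library(stringr)
library(jsonlite)   # fromJSON(), used for the data.gov.sg APIs below
library(geosphere)  # distm() and distGeo(), used to assign areas to carparks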
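# The API key is read from an environment variable instead of being hard-coded (the variable name is an assumption)
ggmap::register_google(key = Sys.getenv("GOOGLE_MAPS_API_KEY"))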
attractions.csv
url <- "https://www.makemytrip.com/travel-guide/singapore/places-to-visit.html"
attractions <- read_html(url) %>% html_nodes(".titleClass") %>% html_text()
attractions_df <- data.frame(name=attractions, address=paste("Singapore", attractions)) %>% mutate_geocode(address) %>% select(-address)
write.csv(attractions_df, "data/data-processed/attractions.csv", row.names = F)
bus.csv and mrt_lrt.csv
mrt_page <- read_html("https://en.wikipedia.org/wiki/List_of_Singapore_MRT_stations")
mrt <- mrt_page %>% html_nodes("h2+ .wikitable td:nth-child(3)") %>% html_text() %>% gsub("\\s•.*", "", .) %>% gsub("\\[.*?\\]", "", .)
mrt <- mrt[!grepl("reserved for a possible future station", mrt, fixed=T)]
mrt <- paste(mrt, "MRT Station")
mrt_df <- data.frame(name=mrt, address=paste("Singapore", mrt)) %>% mutate_geocode(address) %>% select(-address) %>% na.omit()
lrt_page <- read_html("https://en.wikipedia.org/wiki/List_of_Singapore_LRT_stations")
lrt <- paste(lrt_page %>% html_nodes("td:nth-child(3) a , .wikitable td:nth-child(2) a") %>% html_text(), "LRT Station")
lrt <- lrt[lrt != " LRT Station"]
lrt_df <- data.frame(name=lrt, address=paste("Singapore", lrt)) %>% mutate_geocode(address) %>% select(-address)
bus_page <- read_html("https://en.wikipedia.org/wiki/List_of_bus_stations_in_Singapore#Depots_and_bus_parks")
bus_interchange <- bus_page %>% html_nodes(".column-width:nth-child(15) a") %>% html_text()
bus_interchange_df <- data.frame(name=bus_interchange, address=paste("Singapore", bus_interchange)) %>% mutate_geocode(address) %>% select(-address)
bus_terminal <- bus_page %>% html_nodes("p+ .column-width li") %>% html_text()
bus_terminal_df <- data.frame(name=bus_terminal, address=paste("Singapore", bus_terminal)) %>% mutate_geocode(address) %>% select(-address)
mrt_lrt <- rbind(mrt_df, lrt_df)
write.csv(mrt_lrt, "data/data-processed/mrt_lrt.csv", row.names = F)
bus <- rbind(bus_interchange_df, bus_terminal_df)
write.csv(bus, "data/data-processed/bus.csv", row.names = F)
carpark_rates.csv
# Dataset from data.gov.sg
carpark_rates <- read.csv("data/data-original/carpark-rates.csv")
# Fill "-" placeholders with the rate from the preceding period
for (i in 1:nrow(carpark_rates)) {
  if (carpark_rates$weekdays_rate_2[i] == "-") {
    carpark_rates$weekdays_rate_2[i] <- carpark_rates$weekdays_rate_1[i]
  }
  if (carpark_rates$saturday_rate[i] == "-") {
    carpark_rates$saturday_rate[i] <- carpark_rates$weekdays_rate_2[i]
  }
  if (carpark_rates$sunday_publicholiday_rate[i] == "-") {
    carpark_rates$sunday_publicholiday_rate[i] <- carpark_rates$saturday_rate[i]
  }
}
# Expand "Same as wkdays" into both weekday rates
for (i in 1:nrow(carpark_rates)) {
  if (carpark_rates$saturday_rate[i] == "Same as wkdays") {
    carpark_rates$saturday_rate[i] <- paste(carpark_rates$weekdays_rate_1[i],
                                            carpark_rates$weekdays_rate_2[i], sep = ", ")
  }
  if (carpark_rates$sunday_publicholiday_rate[i] == "Same as wkdays") {
    carpark_rates$sunday_publicholiday_rate[i] <- paste(carpark_rates$weekdays_rate_1[i],
                                                        carpark_rates$weekdays_rate_2[i], sep = ", ")
  }
}
# Expand "Same as Saturday" into the Saturday rate
for (i in 1:nrow(carpark_rates)) {
  if (carpark_rates$sunday_publicholiday_rate[i] == "Same as Saturday") {
    carpark_rates$sunday_publicholiday_rate[i] <- carpark_rates$saturday_rate[i]
  }
}
carpark_rates <- data.frame(carpark_rates, address = paste("Singapore", carpark_rates$carpark)) %>%
mutate_geocode(address) %>% select(-address)
carpark_rates <- carpark_rates[!duplicated(carpark_rates$carpark),]
# Add in Bugis+ info that is not in original dataset
tmp <- data.frame(carpark="Bugis+", category="Capitaland",
weekdays_rate_1="8.00am - 5.59pm: 1st hour @ $1.28. Subsequent 15 mins or part thereof @ $0.54",
weekdays_rate_2="Mon-Thu: 6.00pm - 7.59am (next morning): $3.21 per entry. Fri: 6.00pm - 11.59pm: 1st 2 hour @ $3.21. Subsequent 15 mins or part thereof @ $0.54",
saturday_rate="12.00am - 11.59pm: 1st 2 hour @ $3.21. Subsequent 15 mins or part thereof @ $0.54",
sunday_publicholiday_rate="12.00am - 11.59pm: 1st 2 hour @ $3.21. Subsequent 15 mins or part thereof @ $0.54",
address="Singapore Bugis+"
)
tmp <- mutate_geocode(tmp, address) %>% select(-address)
carpark_rates <- rbind(carpark_rates, tmp)
write.csv(carpark_rates, "data/data-processed/carpark_rates.csv", row.names = FALSE)
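The placeholder handling in the carpark_rates.csv code above could also be written without explicit loops. The sketch below applies the same replacement order with ifelse() on a toy data frame. The helper fill_rates_sketch() and the sample rates are made up for illustration, while the column names and placeholder strings follow the dataset.

```r
# Toy rows using the dataset's column names and placeholder strings
rates <- data.frame(
  weekdays_rate_1           = c("$1/hr", "$2/hr", "$3/hr"),
  weekdays_rate_2           = c("-", "$2.50 per entry", "$4 per entry"),
  saturday_rate             = c("-", "Same as wkdays", "$5/hr"),
  sunday_publicholiday_rate = c("-", "Same as Saturday", "Same as wkdays"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: vectorised version of the three cleaning passes.
fill_rates_sketch <- function(df) {
  # Pass 1: "-" inherits the rate of the preceding period
  df$weekdays_rate_2 <- ifelse(df$weekdays_rate_2 == "-",
                               df$weekdays_rate_1, df$weekdays_rate_2)
  df$saturday_rate <- ifelse(df$saturday_rate == "-",
                             df$weekdays_rate_2, df$saturday_rate)
  df$sunday_publicholiday_rate <- ifelse(df$sunday_publicholiday_rate == "-",
                                         df$saturday_rate, df$sunday_publicholiday_rate)
  # Pass 2: "Same as wkdays" expands to both weekday rates
  weekday_text <- paste(df$weekdays_rate_1, df$weekdays_rate_2, sep = ", ")
  df$saturday_rate <- ifelse(df$saturday_rate == "Same as wkdays",
                             weekday_text, df$saturday_rate)
  df$sunday_publicholiday_rate <- ifelse(df$sunday_publicholiday_rate == "Same as wkdays",
                                         weekday_text, df$sunday_publicholiday_rate)
  # Pass 3: "Same as Saturday" copies the (already expanded) Saturday rate
  df$sunday_publicholiday_rate <- ifelse(df$sunday_publicholiday_rate == "Same as Saturday",
                                         df$saturday_rate, df$sunday_publicholiday_rate)
  df
}

cleaned <- fill_rates_sketch(rates)
print(cleaned$sunday_publicholiday_rate)
```

Keeping the passes in the same order as the loops matters, because later replacements read columns that earlier ones have already filled in.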
condominiums.csv
url <- "https://www.srx.com.sg/condo/search/params?selectedDistrictIds=1,2,3,4,5,6,7,8,9,10,11,21,12,13,14,15,16,17,18,19,20,22,23,24,25,26,27,28&maxResults=20"
total_pages <- read_html(url) %>% html_nodes("strong:nth-child(5)") %>% html_text()
condo <- c()
for (i in 1:total_pages) {
page <- read_html(paste0(url, "&page=", i))
names <- page %>% html_nodes(".condo-result-name") %>% html_text()
names <- names %>% gsub("\\s\\(.*?\\)", "", .) %>% gsub("\\s•.*", "", .) %>% gsub("\n|\t", "", .)
condo <- c(condo, names)
}
condo_df <- data.frame(name=condo, address=paste("Singapore", condo)) %>% mutate_geocode(address) %>% select(-address) %>% na.omit()
condo_df <- condo_df %>% filter(lat>=1 & lat<=1.5 & lon>=103 & lon<=105)
write.csv(condo_df, "data/data-processed/condominiums.csv", row.names = F)
hawker_centers.csv
data <- read.csv("data/data-original/list-of-government-markets-hawker-centres.csv")
names(data)[1] <- "name"
data <- data %>% mutate(address=paste("Singapore", name, ",", location_of_centre)) %>% mutate_geocode(address) %>% select(c(name, lon, lat)) %>% na.omit()
write.csv(data, "data/data-processed/hawker_centers.csv", row.names = F)
hdb_info.csv
hdb_info <- read.csv("data/data-original/hdb-carpark-information.csv")
my_sf_df <- st_as_sf(hdb_info, coords = c("x_coord", "y_coord"), crs = 3414)
my_latlon_df <- st_transform(my_sf_df, crs = 4326)
hdb_info <- my_latlon_df %>%
mutate(lon = st_coordinates(my_latlon_df)[,1], lat = st_coordinates(my_latlon_df)[,2])
final_hdb <- hdb_info %>% as.data.frame() %>% select(1:10,12:13)
# Convert the upper-case text columns to title case (str_to_title() is vectorised)
title_cols <- c("address", "car_park_type", "type_of_parking_system",
                "short_term_parking", "free_parking", "night_parking")
final_hdb[title_cols] <- lapply(final_hdb[title_cols], str_to_title)
final_hdb$free_parking <- gsub("Ph Fr", "PH from", final_hdb$free_parking)
write.csv(final_hdb, "data/data-processed/hdb_info.csv", row.names = FALSE)
# Assign area names and ids to HDB carparks
hdbcarparks <- read.csv("data/data-processed/hdb_info.csv", stringsAsFactors = FALSE)
# Historical weather API
query_date_time <- Sys.time() %>% substr(1, 19) %>% gsub(" ", "T", .)
url <- paste0("https://api.data.gov.sg/v1/environment/rainfall?date_time=",
query_date_time)
data <- fromJSON(url)
weather1 <- as.data.frame(data$metadata$stations) %>% select(id, location)
weather2 <- as.data.frame(data$items$readings) %>% select(value)
data_weather1 <- cbind(weather1, weather2)
# Weather forecast API
url <- paste0("https://api.data.gov.sg/v1/environment/2-hour-weather-forecast?date_time=",
query_date_time)
data <- fromJSON(url)
weather1 <- as.data.frame(data$items$forecasts)
weather2 <- as.data.frame(data$area_metadata$label_location)
data_weather2 <- cbind(weather1, weather2)
# Assign area_ids to carparks
areas <- data_weather1 %>% select(id, location)
areas$location <- areas$location[, c(2,1)]
area_col <- c()
for (n in 1:nrow(hdbcarparks)) {
  carpark_coord <- hdbcarparks[n, c("lon", "lat")]
  min_idx <- 0
  min_dist <- 999999999999
  # Find the rainfall station closest to this carpark
  for (i in 1:nrow(areas)) {
    dist <- distm(areas[i, 2], carpark_coord, fun=distGeo)[1,]
    if (dist < min_dist) {
      min_dist <- dist
      min_idx <- i
    }
  }
  area_col <- c(area_col, areas$id[min_idx])
}
hdbcarparks$area_id <- area_col
# Assign area_names to carparks
areas <- data_weather2 %>% select(area, longitude, latitude)
area_col <- c()
for (n in 1:nrow(hdbcarparks)) {
  carpark_coord <- hdbcarparks[n, c("lon", "lat")]
  min_idx <- 0
  min_dist <- 999999999999
  # Find the forecast area closest to this carpark
  for (i in 1:nrow(areas)) {
    dist <- distm(areas[i, c(2,3)], carpark_coord, fun=distGeo)[1,]
    if (dist < min_dist) {
      min_dist <- dist
      min_idx <- i
    }
  }
  area_col <- c(area_col, areas$area[min_idx])
}
hdbcarparks$area <- area_col
write.csv(hdbcarparks, "data/data-processed/hdb_info.csv", row.names = F)
hdb.csv
hdb <- read.csv("data/data-original/hdb-property-information.csv") %>% mutate(name=paste0("HDB Block ", blk_no, ", ", street)) %>% select(name)
hdb <- hdb %>% mutate(address=paste("Singapore", name)) %>% mutate_geocode(address) %>% select(-address) %>% na.omit()
write.csv(hdb, "data/data-processed/hdb.csv", row.names = F)
hospitals_clinics.csv
url <- "http://www.hospitals.sg/hospitals#community-hospitals"
hospitals <- read_html(url) %>% html_nodes("h3+ ul a") %>% html_text()
hospitals_df <- data.frame(name=hospitals, address=paste("Singapore", hospitals)) %>% mutate_geocode(address) %>% select(-address)
clinics <- read.csv("data/data-original/chas-clinics-kml.csv") %>% select(c(HCI_NAME, X, Y))
names(clinics) <- c("name", "lon", "lat")
hospitals_clinics <- rbind(hospitals_df, clinics)
write.csv(hospitals_clinics, "data/data-processed/hospitals_clinics.csv", row.names = F)
hotels.csv
hotels <- read.csv("data/data-original/hotels-kml.csv") %>% select(c(Name, X, Y))
names(hotels) <- c("name", "lon", "lat")
write.csv(hotels, "data/data-processed/hotels.csv", row.names = F)
malls.csv
url <- "https://en.wikipedia.org/wiki/List_of_shopping_malls_in_Singapore"
malls <- read_html(url) %>% html_nodes(".column-width li , h2+ ul li") %>% html_text()
malls_df <- data.frame(name=malls, address=paste("Singapore", malls)) %>% mutate_geocode(address) %>% select(-address)
write.csv(malls_df, "data/data-processed/malls.csv", row.names = F)
schools.csv
primary_sec <- read.csv("data/data-original/general-information-of-schools.csv") %>% select(c(school_name, address))
names(primary_sec)[1] <- "name"
primary_sec <- primary_sec %>% mutate(address=paste("Singapore", name, ",", address))
primary_sec <- primary_sec %>% mutate_geocode(address) %>% select(-address) %>% na.omit()
polytechnic_page <- read_html("https://en.wikipedia.org/wiki/Category:Polytechnics_in_Singapore")
polytechnic <- polytechnic_page %>% html_nodes("#mw-subcategories a , #mw-pages .mw-content-ltr a") %>% html_text()
polytechnic_df <- data.frame(name=polytechnic, address=paste("Singapore", polytechnic)) %>% mutate_geocode(address) %>% select(-address)
university_page <- read_html("https://en.wikipedia.org/wiki/List_of_universities_in_Singapore")
university <- university_page %>% html_nodes("p+ ul li , td:nth-child(1) > a") %>% html_text()
university_df <- data.frame(name=university, address=paste("Singapore", university)) %>% mutate_geocode(address) %>% select(-address)
schools <- rbind(primary_sec, polytechnic_df, university_df)
write.csv(schools, "data/data-processed/schools.csv", row.names = F)
sport_facilities.csv
data1 <- read.csv("data/data-original/AQUATICSG.csv") %>% select(c(Name, X, Y))
names(data1) <- c("name", "lon", "lat")
data2 <- read.csv("data/data-original/PLAYSG.csv") %>% filter(grepl("sport", Name, ignore.case=T)) %>% select(c(Name, X, Y))
names(data2) <- c("name", "lon", "lat")
sport <- rbind(data1, data2)
write.csv(sport, "data/data-processed/sport_facilities.csv", row.names = F)
supermarkets.csv
supermarkets <- read.csv("data/data-original/supermarkets-kml.csv") %>% mutate(LIC_NAME=paste(LIC_NAME, STR_NAME)) %>% select(c(LIC_NAME, X, Y))
names(supermarkets) <- c("name", "lon", "lat")
supermarkets <- supermarkets %>% mutate(name=gsub("PTE. LTD. |PTE LTD |CO-OPERATIVE LTD |PRIVATE LIMITED ", "", name))
write.csv(supermarkets, "data/data-processed/supermarkets.csv", row.names = F)
locations.csv
# Combine all the location categories (from datasets above) into one dataset
attractions <- read.csv("data/data-processed/attractions.csv") %>% mutate(category="Tourist Attractions")
condominiums <- read.csv("data/data-processed/condominiums.csv") %>% mutate(category="Condominiums")
hawker_centers <- read.csv("data/data-processed/hawker_centers.csv") %>% mutate(category="Hawker Centers")
hdb <- read.csv("data/data-processed/hdb.csv") %>% mutate(category="HDB Flats")
hospitals_clinics <- read.csv("data/data-processed/hospitals_clinics.csv") %>% mutate(category="Hospitals & Clinics")
hotels <- read.csv("data/data-processed/hotels.csv") %>% mutate(category="Hotels")
malls <- read.csv("data/data-processed/malls.csv") %>% mutate(category="Shopping Malls")
mrt_lrt <- read.csv("data/data-processed/mrt_lrt.csv") %>% mutate(category="MRT/LRT Stations")
bus <- read.csv("data/data-processed/bus.csv") %>% mutate(category="Bus Stations")
schools <- read.csv("data/data-processed/schools.csv") %>% mutate(category="Schools")
sport_facilities <- read.csv("data/data-processed/sport_facilities.csv") %>% mutate(category="Sports Facilities")
supermarkets <- read.csv("data/data-processed/supermarkets.csv") %>% mutate(category="Supermarkets")
locations <- rbind(attractions, condominiums, hawker_centers, hdb,
hospitals_clinics, hotels, malls, mrt_lrt, bus,
schools, sport_facilities, supermarkets)
write.csv(locations, "data/data-processed/locations.csv", row.names = F)